A global averaging method for dynamic time warping, with applications to clustering
Authors
Abstract
Mining sequential data is an old topic that has been revived in the last decade by the increasing availability of sequential datasets. Most work in this field centres on the definition and use of a distance (or, at least, a similarity measure) between sequences of elements. A measure called Dynamic Time Warping (DTW) currently appears to be the most relevant for a wide range of applications. This article is about the use of DTW in data mining algorithms, and focuses on the computation of an average of a set of sequences. Averaging is an essential tool for the analysis of data. For example, the K-means clustering algorithm repeatedly computes such an average, and needs to provide a description of the clusters it forms. Averaging is therefore a crucial step, which must be sound for such algorithms to work accurately. When dealing with sequences, especially sequences compared with DTW, averaging is not a trivial task. Starting from existing techniques developed around DTW, the article proposes an analysis framework to classify averaging techniques. It then studies the two major questions raised by the framework. First, we develop a global technique for averaging a set of sequences. This technique is original in that it avoids iterative pairwise averaging, and is thus insensitive to ordering effects. Second, we describe a new strategy to reduce the length of the resulting average sequence. This has a favourable impact on performance, and also on the relevance of the results. Both aspects are evaluated on standard datasets, and the evaluation shows that they compare favourably with existing methods. The article ends by describing the use of averaging in clustering. The last section also introduces a new application domain, namely the analysis of satellite image time series, where data mining techniques provide an original approach.
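To make the idea of global (non-pairwise) averaging concrete, the following is a minimal Python sketch, not the article's own implementation. It assumes one-dimensional numeric sequences, a squared-difference local cost, and an unconstrained warping path; the function names `dtw_path` and `global_average_step` are hypothetical and chosen only for illustration. The sketch aligns every sequence to a current average with DTW, then replaces each element of the average by the mean of all elements aligned to it, so the order in which the sequences are visited does not affect the result.

```python
from math import inf


def dtw_path(a, b):
    """Return (cost, path) for the optimal DTW alignment of sequences a and b.

    Assumptions (not from the article): 1-D numeric, non-empty sequences and a
    squared-difference local cost. The path is a list of (i, j) index pairs.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the last cell to recover which elements were aligned.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal move
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def global_average_step(average, sequences):
    """One refinement pass of a global averaging scheme (hypothetical sketch).

    Every sequence is aligned to the current average; each element of the
    average is then replaced by the mean of all elements aligned to it.
    """
    buckets = [[] for _ in average]
    for seq in sequences:
        _, path = dtw_path(average, seq)
        for i, j in path:
            buckets[i].append(seq[j])
    # Each index of the average appears in every warping path, so no bucket is empty.
    return [sum(bucket) / len(bucket) for bucket in buckets]


if __name__ == "__main__":
    seqs = [[1, 2, 3], [1, 2, 2, 3]]
    print(global_average_step([1, 2, 3], seqs))
```

In practice such a step would be repeated from some initial average until the result stops changing; the choice of initial average and of stopping rule is left open here, since the abstract does not specify them.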
Related articles
Inaccuracies of Shape Averaging Method Using Dynamic Time Warping for Time Series Data
Shape averaging or signal averaging of time series data is one of the prevalent subroutines in data mining tasks. The Dynamic Time Warping (DTW) distance measure is known to work exceptionally well with time series data, as has long been demonstrated in data mining tasks involving shape similarity across various domains. More specifically, in some tasks such as query refinement,...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
Progressive and Iterative Approaches for Time Series Averaging
Averaging a set of time series is a major topic for many temporal data mining tasks such as summarization, prototype extraction, or clustering. Time series averaging must deal with the tricky multiple temporal alignment problem, a still challenging issue in various domains. This work compares the major progressive and iterative time series averaging methods under dynamic time warping (DTW).
Time-series averaging using constrained dynamic time warping with tolerance
In this paper, we propose an innovative averaging of a set of time-series based on the Dynamic Time Warping (DTW). The DTW is widely used in data mining since it provides not only a similarity measure, but also a temporal alignment of time-series. However, its use is often restricted to the case of a pair of signals. In this paper, we propose to extend its application to a set of signals by pro...
Optimal Current Meter Placement for Accurate Fault Location Purpose using Dynamic Time Warping
This paper presents a fault location technique for transmission lines with minimum current measurement. The algorithm investigates proper current ratios for the fault location problem, based on Thevenin theory in faulty power networks and the calculation of short circuit currents in each branch. These current ratios are extracted regarding lowest sensitivity on Thevenin impedance variations of the netw...
Journal:
- Pattern Recognition
Volume 44, Issue
Pages -
Publication date: 2011